A Right-to-Left Chart Parsing for Dependency Grammar using Headable Paths
Authors
Abstract
In this paper, we propose a right-to-left dependency grammar parsing method for languages such as Korean and Japanese, in which a governor appears after its modifier. Unlike conventional left-to-right parsers, this method can exploit the governor post-positioning property of such languages to reduce the size of the search space by using the idea of a headable path. A headable path is a path which contains all candidate words that can be the governor of an input word during parsing. A preliminary experiment showed very promising results in reducing both the number of intermediate dependency structures and the actual parsing time.
The degree of freedom in word order varies from language to language. Languages like English, French or Chinese are relatively fixed word-order languages, while languages like German are less fixed. On the other hand, languages like Korean, Japanese, Russian or Finnish are relatively free word-order languages [1, 4]. It has been argued that phrase structure grammar (PSG), which has been popular for the fixed word-order languages, is not well suited to the relatively free word-order languages [5].
Dependency grammar (DG), first treated in detail by L. Tesnière, is a grammar formalism that describes the dependency relations between the words in a sentence. DG is not concerned with the syntactic structure of a sentence. Rather, it focuses on which word modifies or depends on which other word. In this sense, DG is much weaker than PSG in describing the syntax of a language. However, this property makes DG more suitable for describing the syntax of relatively free word-order languages such as Korean and Japanese.
In this paper, we propose an efficient parsing method, called the Right-to-Left Chart Parser for Dependency Grammar (RLCP-DG), using headable paths. A headable path is a path that is defined at each parsing step for each input word and that contains all candidate words which can be the governors of that input word.
RLCP-DG analyzes sentences from right to left by exploiting the governor post-positioning property of Altaic languages such as Korean: a modifiee, i.e., a modified word, almost always appears after its modifier in a sentence. This property holds fairly generally across Altaic languages. For presentation purposes, we call a modifiee a governor and a modifier a dependent. Parsing a sentence with DG means finding the dependency relations between the words in the sentence. If the parser makes a …
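The idea of restricting governor candidates to a headable path can be illustrated with a minimal Python sketch. It is not the chart parser described in the paper. It assumes projective structures, that every governor lies strictly to the right of its dependent, and that the final word is the root; under those assumptions, the only possible governors of a newly read word are the word immediately to its right and that word's chain of governors. The names `headable_path`, `parse_right_to_left`, `can_govern`, and the toy tag rules are hypothetical, and the greedy nearest-first attachment stands in for the chart-based search.

```python
from typing import Callable, Dict, List, Optional

Heads = Dict[int, Optional[int]]  # word index -> governor index (None for the root)


def headable_path(heads: Heads, nearest: int) -> List[int]:
    """Return `nearest` followed by its chain of governors up to the root."""
    path: List[int] = []
    node: Optional[int] = nearest
    while node is not None:
        path.append(node)
        node = heads[node]
    return path


def parse_right_to_left(tags: List[str],
                        can_govern: Callable[[str, str], bool]) -> Heads:
    """Greedy illustration: scanning from the last word to the first, attach
    each word to the nearest acceptable candidate on its headable path."""
    if not tags:
        return {}
    last = len(tags) - 1
    heads: Heads = {last: None}  # assumption: the final word is the root
    for i in range(last - 1, -1, -1):
        # Candidates are limited to the path starting at the word to the right.
        for g in headable_path(heads, i + 1):
            if can_govern(tags[g], tags[i]):
                heads[i] = g
                break
        else:
            raise ValueError(f"no governor found for word {i}")
    return heads


# Toy rules (hypothetical): which governor tag may take which dependent tag.
ALLOWED = {("VERB", "NOUN"), ("VERB", "ADV"), ("NOUN", "ADJ")}


def can_govern(governor_tag: str, dependent_tag: str) -> bool:
    return (governor_tag, dependent_tag) in ALLOWED


toy_heads = parse_right_to_left(["ADJ", "NOUN", "ADV", "VERB"], can_govern)
```

In this toy run, when the first word (ADJ) is read, its headable path contains only the NOUN at position 1 and the VERB at position 3; the ADV at position 2 is never considered as a governor, which is the kind of search-space reduction the headable path is meant to provide.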
Similar resources
Treebank Grammar Techniques for Non-Projective Dependency Parsing
An open problem in dependency parsing is the accurate and efficient treatment of non-projective structures. We propose to attack this problem using chart-parsing algorithms developed for mildly context-sensitive grammar formalisms. In this paper, we provide two key tools for this approach. First, we show how to reduce non-projective dependency parsing to parsing with Linear Context-Free Rewriting...
Positive Results for Parsing with a Bounded Stack using a Model-Based Right-Corner Transform
Statistical parsing models have recently been proposed that employ a bounded stack in time-series (left-to-right) recognition, using a right-corner transform defined over training trees to minimize stack use (Schuler et al., 2008). Corpus results have shown that a vast majority of naturally-occurring sentences can be parsed in this way using a very small stack bound of three to four elements. Thi...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Treebank-Based Acquisition of Chinese LFG Resources for Parsing and Generation
This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena an...